We explore how combining LightThinker and Multi-Head Latent Attention cuts memory usage and boosts performance.
This is one of the hottest topics thanks to DeepSeek. Learn with us: the core idea, its types, scaling laws, real-world use cases, and useful resources to dive deeper.
We explore the power of datasets and how they are integrated into Hugging Face's family of small language models, particularly SmolLM2.
We discuss how to enable the Mamba Selective State Space Model (SSM) to handle multimodal data using the Mixture-of-Transformers concept and modality-aware sparsity.
We explore Google's and Microsoft's advances that implement "chain" approaches for long-context and multi-hop reasoning.
We dive into test-time compute and discuss five-plus open-source methods for scaling it effectively to support deep, step-by-step model reasoning.
Practical Insights for Large Language Models
World models are the next big thing enabling Physical AI. Let's explore how NVIDIA makes it happen.
Plus a Video Interview with the SwiftKV Authors on Reducing LLM Inference Costs by up to 75%
We explore in detail three RAG methods that address the limitations of the original RAG and align with the upcoming trends of the new year.
Take some time to learn or revisit the key concepts, techniques, and models that matter most.
Explore how RL can be blended with natural language.